Overlay: Use overlay-aware CLI version when analyzing PRs#3880
Overlay: Use overlay-aware CLI version when analyzing PRs#3880henrymercer wants to merge 19 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates how the Action selects the default CodeQL CLI version so that, when analyzing pull requests, it can prefer an enabled CLI version that already has cached overlay-base databases for the configured languages (to speed up overlay/incremental analysis), while still respecting feature-flag rollback constraints.
Changes:
- Extend the default CLI “version info” returned from feature flags to include a sorted list of enabled default versions (not just a single version).
- Add PR-aware logic in
setup-codeqlto optionally pick the highest enabled version that intersects with overlay-base DB cache entries (with dry-run telemetry support). - Thread the new version-info shape through callers and update unit tests and the changelog entry.
Show a summary per file
| File | Description |
|---|---|
| src/upload-lib.ts | Switches to the new getEnabledDefaultCliVersions API and passes rawLanguages through initCodeQL. |
| src/testing-utils.ts | Updates test fixtures/mocks for the new CodeQLDefaultVersionInfo.enabledVersions shape. |
| src/start-proxy.ts | Adapts to the new version-info shape by selecting enabledVersions[0] for proxy downloads. |
| src/start-proxy.test.ts | Updates stubbing to getEnabledDefaultCliVersions. |
| src/setup-codeql.ts | Implements overlay-aware default-version resolution for PR analyses (feature-flag gated) and threads rawLanguages. |
| src/setup-codeql.test.ts | Updates call sites for new rawLanguages parameter and adds unit tests for overlay-cache version filtering. |
| src/setup-codeql-action.ts | Updates to getEnabledDefaultCliVersions and passes rawLanguages (currently undefined). |
| src/init.ts | Threads rawLanguages through to setupCodeQL. |
| src/init-action.ts | Uses getEnabledDefaultCliVersions and passes rawLanguages derived from the languages input. |
| src/feature-flags.ts | Introduces CodeQLVersionInfo, changes default version info to enabledVersions[], and adds new overlay match feature flags. |
| src/feature-flags.test.ts | Updates tests to validate multi-version enablement ordering/fallback behavior. |
| src/codeql.ts | Threads rawLanguages into tool setup to support PR-aware version resolution. |
| src/codeql.test.ts | Updates tests for the new default-version info shape and new function signatures. |
| CHANGELOG.md | Adds an UNRELEASED entry describing the experimental overlay-aware default version selection. |
| lib/upload-sarif-action-post.js | Generated JS output (not reviewed). |
| lib/upload-lib.js | Generated JS output (not reviewed). |
| lib/start-proxy-action.js | Generated JS output (not reviewed). |
| lib/start-proxy-action-post.js | Generated JS output (not reviewed). |
| lib/resolve-environment-action.js | Generated JS output (not reviewed). |
| lib/init-action.js | Generated JS output (not reviewed). |
| lib/autobuild-action.js | Generated JS output (not reviewed). |
| lib/analyze-action.js | Generated JS output (not reviewed). |
| lib/analyze-action-post.js | Generated JS output (not reviewed). |
Copilot's findings
Comments suppressed due to low confidence (1)
src/start-proxy.test.ts:1029
- The stub variable is named
getDefaultCliVersion, but it actually stubsgetEnabledDefaultCliVersions. Renaming the local variable (and the later assertion) would avoid confusion and better reflect what’s being tested.
const getDefaultCliVersion = sinon
.stub(features, "getEnabledDefaultCliVersions")
.resolves({
enabledVersions: [{ cliVersion: "2.20.1", tagName: expectedTag }],
});
const path = await startProxyExports.getProxyBinaryPath(logger, features);
t.assert(getDefaultCliVersion.calledOnce);
sinon.assert.calledOnceWithMatch(
- Files reviewed: 14/26 changed files
- Comments generated: 3
| const version = await resolveDefaultCliVersion( | ||
| defaultCliVersion, | ||
| rawLanguages, | ||
| features, | ||
| logger, | ||
| ); | ||
| cliVersion = version.cliVersion; | ||
| tagName = version.tagName; | ||
| } |
mbg
left a comment
There was a problem hiding this comment.
Thanks for breaking this up into (reasonably) sane commits and a separate PR from the previous one! Overall, this looks like a pretty good first stab at this change and I don't see any major issues here.
I have left a bunch of detailed comments. There are a couple in particular about possible issues down the road that we might want to consider documenting or guarding against.
I am wondering if the second FF is a bit overkill and adds an extra layer of complexity on top of an already fairly complex change. I don't feel strongly about this, but the thought crossed my mind whether a reasonable thing to do here to reduce complexity / risk would be to consider breaking this PR up into three separate changes:
- The first commit that modifies some of the existing logic to return all FF-enabled CLI versions, but doesn't make use of this information. We can ship that change without a new FF.
- Then ship the new logic that's guarded by the second FF. I.e. everything except actually using the CLI version that we determined using the new process.
- Then ship the first FF + the logic that uses the CLI version determined using the new process.
| // version, or the one enabled by FFs. | ||
| const versionInfo = useFeaturesToDetermineCLI | ||
| ? await getCliVersionFromFeatures(features) | ||
| ? (await getCliVersionFromFeatures(features)).enabledVersions[0] |
There was a problem hiding this comment.
This is an interesting change. It might mean that the bundle release used by start-proxy and init could diverge. That's certainly not a problem right now, since we don't depend on them lining up. (I.e. we have thus far not had any reason to require the proxy to be of at least a particular version for changes made in the CLI/extractors.)
It could also mean that a base database is extracted using the version of the proxy that shipped with the corresponding CLI, and we then end up using a newer proxy for the overlay analysis on a PR. I don't see an immediate problem with it, but it is worth noting.
We might want to think about whether this is something we want to document somehow so that it doesn't become a problem if we ever want to make that assumption.
There was a problem hiding this comment.
I was originally thinking we'd do this in a separate PR, but let's just add the languages and analysis-kinds inputs to avoid confusion here.
There was a problem hiding this comment.
That solution is OK, but a concern here is that it takes the setup-codeql action a step away from an action that just handles installing the CLI in the sense that it now needs to be aware of what is going to be analysed (if anything). I think the changes are OK though because:
- The
languagesandanalysis-kindsinputs are optional (ish -analysis-kindshas a default). - We are still just installing the CLI, but for overlay-enabled Code Scanning analyses the new logic is unavoidable and that requires the information about the (expected) analysis.
There was a problem hiding this comment.
Indeed, since we're now basing our choice of CLI on the languages being analysed when doing a code scanning analysis, I think this is a necessity.
| getTemporaryDirectory(), | ||
| gitHubVersion.type, | ||
| codeQLDefaultVersionInfo, | ||
| undefined, // rawLanguages: currently, setup-codeql is not language aware |
There was a problem hiding this comment.
A potential issue down the line if we ever want to expand on setup-codeql and allow it to run before init. It might be worth documenting the implications of this on overlay analysis here.
Co-authored-by: Michael B. Gale <mbg@github.com>
666a561 to
b4ea7aa
Compare
I have a mild preference against splitting this PR up further now, but I agree it would have been better to introduce just the "dry run" changes in a separate PR. Happy to factor this into a separate PR if you have a strong preference. I would prefer to keep the "dry run" changes behind a feature flag though — we're making an extra API request to list the caches, and even though it's not much code, it's safer to roll it out with a flag. |
mbg
left a comment
There was a problem hiding this comment.
I have a mild preference against splitting this PR up further now, but I agree it would have been better to introduce just the "dry run" changes in a separate PR. Happy to factor this into a separate PR if you have a strong preference.
It's fine to keep it as-is. I don't feel strongly enough about this to push for splitting it up.
Otherwise the changes look good, thanks for addressing my feedback! I have left a few minor comments.
| `github/codeql-action/init` and `github/codeql-action/analyze` invocations. If specified, the | ||
| Action may use this list to select a CodeQL CLI version that is best suited to analyzing those | ||
| languages, for example by preferring a version that has a cached overlay-base database for the |
There was a problem hiding this comment.
Minor: I'd drop the "If specified, [..], for example by preferring a version that has a cached overlay-base database for the specified languages." part from this description. My thinking here is that:
setup-codeqlis mainly about installing the CLI.- So the main thing anyone using this should care about is the
toolsinput. - If
toolsis provided, I'd expect us to honour it. - If it is unspecified, then
languagesmay affect the CLI version we choose, but so do other things. - Therefore, I am not sure it's worth pointing out here explicitly.
There was a problem hiding this comment.
I think this is mostly relevant to us as we develop default setup, indeed it seems unlikely anyone would want to use setup-codeql in an advanced setup. I'm leaning on the side of keeping it to explain why it's there when setup-codeql is otherwise language agnostic.
| return languages; | ||
| } | ||
|
|
||
| /** Parses the `languages` input into a list of languages without checking if they are supported by CodeQL. */ |
There was a problem hiding this comment.
Minor: Really all this does is split a comma-separated string into an array, removes excess space characters, converts the strings to lower-case, and removes empty elements. The current description might suggest that something more specific to languages is happening here. How about:
Splits a comma-separated string into an array. Excess spaces are removed and all characters are converted to lower-case.
There was a problem hiding this comment.
That's a little implementation specific. Perhaps the main opposition is with "Parses"? I'll update that.
|
|
||
| Available options are the same as for the `analysis-kinds` input on the `init` Action. | ||
| default: 'code-scanning' | ||
| required: true |
There was a problem hiding this comment.
Since this input has a default, we're disallowing specifying analysis-kinds: null. I don't see why we'd want to allow that, hence disallowing it.
| // version, or the one enabled by FFs. | ||
| const versionInfo = useFeaturesToDetermineCLI | ||
| ? await getCliVersionFromFeatures(features) | ||
| ? (await getCliVersionFromFeatures(features)).enabledVersions[0] |
There was a problem hiding this comment.
That solution is OK, but a concern here is that it takes the setup-codeql action a step away from an action that just handles installing the CLI in the sense that it now needs to be aware of what is going to be analysed (if anything). I think the changes are OK though because:
- The
languagesandanalysis-kindsinputs are optional (ish -analysis-kindshas a default). - We are still just installing the CLI, but for overlay-enabled Code Scanning analyses the new logic is unavoidable and that requires the information about the (expected) analysis.
When analyzing PRs, prefer CLI versions that have cached overlay-base databases to speed up analysis time. However to ensure we can effectively rollback new versions, do not use a CodeQL version whose feature flag is disabled, even if this means running without overlay analysis.
This will be shipped via two feature flags:
overlay_analysis_match_codeql_version_dry_runlogs a diagnostic when the overlay-aware version differs from the latest enabled versionoverlay_analysis_match_codeql_versionuses the overlay-aware version when analysing PRsRisk assessment
For internal use only. Please select the risk level of this change:
Which use cases does this change impact?
Workflow types:
dynamicworkflows (Default Setup, Code Quality, ...).Products:
analysis-kinds: code-scanning.analysis-kinds: code-quality.Environments:
github.comand/or GitHub Enterprise Cloud with Data Residency.How did/will you validate this change?
.test.tsfiles).If something goes wrong after this change is released, what are the mitigation and rollback strategies?
How will you know if something goes wrong after this change is released?
Are there any special considerations for merging or releasing this change?
Merge / deployment checklist